
    Understanding concurrent earcons: applying auditory scene analysis principles to concurrent earcon recognition

    Two investigations were carried out into the identification of concurrently presented structured sounds, called earcons. The first experiment investigated how varying the number of concurrently presented earcons affected their identification. The number presented had a significant effect on the proportion of earcons identified: reducing the number of concurrently presented earcons led to a general increase in the proportion successfully identified. The second experiment investigated how modifying the earcons and their presentation, using techniques informed by auditory scene analysis, affected earcon identification. Both giving each earcon a unique timbre and introducing a 300 ms onset-to-onset delay between earcons were found to significantly increase identification. Guidelines were drawn from this work to assist future interface designers when incorporating concurrently presented earcons.
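    As a rough illustration of the second experiment's two manipulations, the sketch below mixes three synthetic earcons so that each has a distinct timbre and their onsets are staggered by 300 ms. All frequencies, durations, and waveform choices are assumptions for demonstration, not the authors' stimuli.

```python
# Illustrative sketch only: staggering three synthetic "earcons" by a
# 300 ms onset-to-onset delay and giving each a distinct timbre (waveform).
# Frequencies, durations, and waveforms are assumed for demonstration.
import numpy as np

SR = 44100          # sample rate (Hz)
ONSET_DELAY = 0.3   # 300 ms onset-to-onset delay


def tone(freq, dur, timbre):
    """Generate a short tone whose waveform ('timbre') differs per earcon."""
    t = np.arange(int(SR * dur)) / SR
    if timbre == "sine":
        return np.sin(2 * np.pi * freq * t)
    if timbre == "square":
        return np.sign(np.sin(2 * np.pi * freq * t))
    return 2 * (freq * t - np.floor(0.5 + freq * t))  # sawtooth-like


def mix_earcons(earcons, total_dur=2.0):
    """Sum earcons into one buffer, offsetting each onset by ONSET_DELAY."""
    out = np.zeros(int(SR * total_dur))
    for i, (freq, dur, timbre) in enumerate(earcons):
        onset = int(SR * ONSET_DELAY * i)
        sig = tone(freq, dur, timbre)[: len(out) - onset]
        out[onset:onset + len(sig)] += sig
    return out / len(earcons)  # crude normalization to avoid clipping


mix = mix_earcons([(440, 0.5, "sine"), (660, 0.5, "square"), (880, 0.5, "saw")])
```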

    Optimizing the spatial configuration of a seven-talker speech display

    Proceedings of the 9th International Conference on Auditory Display (ICAD), Boston, MA, July 7-9, 2003. Although there is substantial evidence that performance in multitalker listening tasks can be improved by spatially separating the apparent locations of the competing talkers, very little effort has been made to determine the best locations and presentation levels for the talkers in a multichannel speech display. In this experiment, a call-sign based color and number identification task was used to evaluate the effectiveness of three different spatial configurations and two different level normalization schemes in a seven-channel binaural speech display. When only two spatially-adjacent channels of the seven-channel system were active, overall performance was substantially better with a geometrically-spaced spatial configuration (with far-field talkers at -90°, -30°, -10°, 0°, +10°, +30°, and +90° azimuth) or a hybrid near-far configuration (with far-field talkers at -90°, -30°, 0°, +30°, and +90° azimuth and near-field talkers at ±90° azimuth) than with a more conventional linearly-spaced configuration (with far-field talkers at -90°, -60°, -30°, 0°, +30°, +60°, and +90° azimuth). When all seven channels were active, performance was generally better with a "better-ear" normalization scheme that equalized the levels of the talkers in the more intense ear than with a default normalization scheme that equalized the levels of the talkers at the center of the head. The best overall performance in the seven-talker task occurred when the hybrid near-far spatial configuration was combined with the better-ear normalization scheme. This combination resulted in a 20% increase in the number of correct identifications relative to the baseline condition with linearly-spaced talker locations and no level normalization. Although this is a relatively modest improvement, it should be noted that it could be achieved at little or no cost simply by reconfiguring the HRTFs used in a multitalker speech display.
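    The level-normalization comparison lends itself to a brief sketch. The Python below is a hedged illustration, not the study's code: the function names and target level are invented, and the "center of the head" level is approximated here as the mean of the two ear levels. The default scheme equalizes each talker's level at the center of the head, whereas the better-ear scheme equalizes each talker's level in whichever ear receives it more intensely.

```python
# Hedged sketch, not the study's code: two level-normalization schemes for a
# binaural (HRTF-filtered) multitalker display. Target level, function names,
# and the center-of-head approximation are illustrative assumptions.
import numpy as np


def rms(x):
    return np.sqrt(np.mean(np.square(x)))


def normalize_default(talkers, target=0.1):
    """Equalize each talker's level averaged across the two ears
    (a stand-in for the level at the center of the head)."""
    out = []
    for left, right in talkers:      # each talker is an HRTF-filtered (L, R) pair
        gain = target / (0.5 * (rms(left) + rms(right)))
        out.append((left * gain, right * gain))
    return out


def normalize_better_ear(talkers, target=0.1):
    """Equalize each talker's level in the ear where it is more intense."""
    out = []
    for left, right in talkers:
        gain = target / max(rms(left), rms(right))
        out.append((left * gain, right * gain))
    return out
```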

    Detection and localization of speech in the presence of competing speech signals

    Presented at the 12th International Conference on Auditory Display (ICAD), London, UK, June 20-23, 2006. Auditory displays are often used to convey important information in complex operational environments. One problem with these displays is that potentially critical information can be corrupted or lost when multiple warning sounds are presented at the same time. In this experiment, we examined a listener's ability to detect and localize a target speech token in the presence of one to five simultaneous competing speech tokens. Two conditions were examined: a condition in which all of the speech tokens were presented from the same location (the 'co-located' condition) and a condition in which the speech tokens were presented from different random locations (the 'spatially separated' condition). The results suggest that both detection and localization degrade as the number of competing sounds increases. However, the changes in detection performance were found to be surprisingly small, and there appeared to be little or no benefit of spatial separation for detection. Localization, on the other hand, was found to degrade substantially and systematically as the number of competing speech tokens increased. Overall, these results suggest that listeners are able to extract substantial information from these speech tokens even when the target is presented with five competing simultaneous sounds.

    Flying by Ear: Blind Flight with a Music-Based Artificial Horizon

    Two experiments were conducted in actual flight operations to evaluate an audio artificial horizon display that imposed aircraft attitude information on pilot-selected music. The first experiment examined a pilot's ability to identify, with vision obscured, a change in aircraft roll or pitch, with and without the audio artificial horizon display. The results suggest that the audio horizon display improves the accuracy of attitude identification overall, but differentially affects response time across conditions. In the second experiment, subject pilots performed recoveries from displaced aircraft attitudes using either standard visual instruments or, with vision obscured, the audio artificial horizon display. The results suggest that subjects were able to maneuver the aircraft to within its safety envelope. Overall, pilots were able to benefit from the display, suggesting that such a display could help to improve overall safety in general aviation.

    Acoustic Cues for Sound Source Distance and Azimuth in Rabbits, a Racquetball and a Rigid Spherical Model

    There are numerous studies measuring the transfer functions representing signal transformation between a source and each ear canal, i.e., the head-related transfer functions (HRTFs), for various species. However, only a handful of these address the effects of sound source distance on HRTFs. This is the first study of HRTFs in the rabbit in which the emphasis is on the effects of sound source distance and azimuth. With the rabbit placed in an anechoic chamber, we made acoustic measurements with miniature microphones placed deep in each ear canal while a sound source was positioned at different locations (10–160 cm distance, ±150° azimuth). The sound was a logarithmically swept broadband chirp. For comparison, we also obtained HRTFs from a racquetball and from a computational model of a rigid sphere. We found that (1) the spectral shape of the HRTF in each ear changed with sound source location; (2) the interaural level difference (ILD) increased with decreasing distance and with increasing frequency, and ILDs can be substantial even at low frequencies when the source is close; and (3) the interaural time difference (ITD) decreased with decreasing distance and generally increased with decreasing frequency. The observations in the rabbit were, in general, reproduced by those in the racquetball, albeit greater in magnitude in the rabbit. Results from the sphere model were partly similar to and partly different from those for the racquetball and the rabbit. These findings refute the common notions that ILD is negligible at low frequencies and that ITD is constant across frequency; these misconceptions became evident when distance-dependent changes were examined.
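    For context on the rigid-sphere comparison, the snippet below evaluates the classic Woodworth far-field ITD approximation for a sphere. The head radius is an assumed, rabbit-scale value and this is not the computational model used in the study; the study's point is precisely that the distance-independent, frequency-constant ITD implied by this kind of approximation does not hold when sources are close.

```python
# Illustrative only: Woodworth's far-field ITD approximation for a rigid
# sphere, ITD = (a / c) * (theta + sin(theta)). The head radius below is an
# assumed rabbit-scale value; the study shows measured ITDs in fact grow at
# low frequencies and shrink at close distances, unlike this approximation.
import numpy as np


def woodworth_itd(azimuth_deg, head_radius_m=0.025, speed_of_sound=343.0):
    """Far-field ITD (seconds) predicted for a rigid sphere."""
    theta = np.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + np.sin(theta))


print(woodworth_itd(90.0))  # roughly 1.9e-4 s for a 2.5 cm radius
```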

    Judging Time-to-Passage of looming sounds: evidence for the use of distance-based information

    Perceptual judgments are an essential mechanism for our everyday interaction with other moving agents or events. For instance, estimation of the time remaining before an object contacts or passes us is essential to act upon or to avoid that object. Previous studies have demonstrated that participants use different cues to estimate the time to contact or the time to passage of approaching visual stimuli. Despite the considerable number of studies on the judgment of approaching auditory stimuli, not much is known about the cues that guide listeners’ performance in an auditory Time-to-Passage (TTP) task. The present study evaluates how accurately participants judge approaching white-noise stimuli in a TTP task that included variable occlusion periods (the portion of the presentation time during which the stimulus is not audible). Results showed that participants were able to accurately estimate TTP, and their performance was, in general, only weakly affected by the occlusion periods. Moreover, we examined the psychoacoustic variables provided by the stimuli and analysed how binaural cues related to performance in the psychophysical task. The binaural temporal difference appears to be the psychoacoustic cue guiding participants’ performance for smaller amounts of occlusion, while the binaural loudness difference appears to be the cue guiding performance for larger amounts of occlusion. These results allowed us to explain the perceptual strategies used by participants in a TTP task (maintaining accuracy by shifting the informative cue for TTP estimation), and to demonstrate that the psychoacoustic cue guiding listeners’ performance changes according to the occlusion period. This study was supported by: Bial Foundation Grant 143/14 (https://www.bial.com/en/bial_foundation.11/11th_symposium.219/fellows_preliminary_results.235/fellows_preliminary_results.a569.html); FCT PTDC/EEAELC/112137/2009 (https://www.fct.pt/apoios/projectos/consulta/vglobal_projecto?idProjecto=112137&idElemConcurso=3628); and COMPETE POCI-01-0145-FEDER-007043 and FCT – Fundação para a Ciência e Tecnologia within Project Scope UID/CEC/00319/2013.
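    The two binaural cues named above can be estimated from a stereo recording with standard signal-processing steps. The sketch below uses a generic method with an invented function name, not the authors' analysis pipeline: the binaural temporal difference is taken as the lag of the peak interaural cross-correlation, and the binaural loudness difference is approximated as an RMS level ratio in dB.

```python
# Generic sketch (not the study's analysis code): estimate the binaural
# temporal difference as the lag of the peak interaural cross-correlation,
# and approximate the binaural loudness difference as an RMS level ratio
# in dB. Sample rate and function name are assumptions.
import numpy as np


def binaural_cues(left, right, sr=44100):
    # Temporal difference: lag (in seconds) of the cross-correlation peak
    xcorr = np.correlate(left, right, mode="full")
    lag = np.argmax(xcorr) - (len(right) - 1)
    temporal_diff_s = lag / sr

    # Loudness difference, approximated as the left/right RMS ratio in dB
    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    level_diff_db = 20.0 * np.log10(rms(left) / rms(right))
    return temporal_diff_s, level_diff_db
```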

    Cueing listeners to attend to a target talker progressively improves word report as the duration of the cue-target interval lengthens to 2000 ms

    Endogenous attention is typically studied by presenting instructive cues in advance of a target stimulus array. For endogenous visual attention, task performance improves as the duration of the cue-target interval increases up to 800 ms. Less is known about how endogenous auditory attention unfolds over time or the mechanisms by which an instructive cue presented in advance of an auditory array improves performance. The current experiment used five cue-target intervals (0, 250, 500, 1000, and 2000 ms) to compare four hypotheses for how preparatory attention develops over time in a multi-talker listening task. Young adults were cued to attend to a target talker who spoke in a mixture of three talkers. Visual cues indicated the target talker’s spatial location or their gender. Participants directed attention to location and gender simultaneously (‘objects’) at all cue-target intervals. Participants were consistently faster and more accurate at reporting words spoken by the target talker when the cue-target interval was 2000 ms than when it was 0 ms. In addition, the latency of correct responses progressively shortened as the duration of the cue-target interval increased from 0 to 2000 ms. These findings suggest that the mechanisms involved in preparatory auditory attention develop gradually over time, taking at least 2000 ms to reach optimal configuration, yet providing cumulative improvements in speech intelligibility as the duration of the cue-target interval increases from 0 to 2000 ms. These results demonstrate an improvement in performance for cue-target intervals longer than those that have been reported previously in the visual or auditory modalities.

    Auditory spatial representations of the world are compressed in blind humans

    Compared to sighted listeners, blind listeners often display enhanced auditory spatial abilities such as localization in azimuth. However, less is known about whether blind humans can accurately judge distance in extrapersonal space using auditory cues alone. Using virtualization techniques, we show that auditory spatial representations of the world beyond the peripersonal space of blind listeners are compressed compared to those for normally sighted controls. Blind participants overestimated the distance to nearby sources, and underestimated the distance to remote sound sources, in both reverberant and anechoic environments, and for speech, music, and noise signals. Functions relating judged and actual virtual distance were well fitted by compressive power functions, indicating that the absence of visual information regarding the distance of sound sources may prevent accurate calibration of the distance information provided by auditory signals.
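    The compressive power-function fit described above amounts to fitting judged = k · actual^a with an exponent a < 1. A minimal sketch of such a fit in log-log space follows; the distance and judgment values are made-up placeholders, not the study's data.

```python
# Minimal sketch of a compressive power-function fit: judged = k * actual**a.
# The numbers below are illustrative placeholders, NOT data from the study;
# an exponent a < 1 indicates a compressed distance representation.
import numpy as np

actual = np.array([1.0, 2.0, 4.0, 8.0, 16.0])   # virtual source distances (m), assumed
judged = np.array([1.3, 2.1, 3.4, 5.5, 8.7])    # hypothetical distance judgments (m)

a, log_k = np.polyfit(np.log(actual), np.log(judged), 1)   # linear fit in log-log space
k = np.exp(log_k)
print(f"judged ≈ {k:.2f} * actual ** {a:.2f}")  # a < 1 => compression
```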